A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering
Authors
Abstract
A new framework for time-domain voiced phoneme recognition is presented. Each speech frame used for training and recognition is bounded by consecutive glottal closures. A preprocessing stage is designed and implemented to model pitch-synchronous frames with Gaussian mixture models. Component analysis carried out on the data shows optimal performance with a very small number of components, which keeps the computational cost low. We designed a new clustering technique that, by using the pitch period, gives better results than other well-known clustering algorithms such as k-means.
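As an illustration of the pitch-synchronous framing described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the glottal closure instants are already available as sample indices (how they are detected is not covered here). The function names `pitch_synchronous_frames` and `resample_linear` are hypothetical, and the linear resampling to a fixed length is an added assumption so that frames with different pitch periods can be compared; the abstract does not say how frame length is normalized.

```python
from typing import List, Sequence, Tuple


def pitch_synchronous_frames(
    signal: Sequence[float], closures: Sequence[int]
) -> List[Tuple[int, List[float]]]:
    """Cut one frame per pitch period, bounded by consecutive glottal closures.

    `closures` holds sample indices of glottal closure instants (assumed given).
    Returns (pitch_period_in_samples, frame) pairs.
    """
    frames = []
    for start, end in zip(closures, closures[1:]):
        if not 0 <= start < end <= len(signal):
            raise ValueError("closure instants must be increasing and in range")
        frames.append((end - start, list(signal[start:end])))
    return frames


def resample_linear(frame: Sequence[float], length: int) -> List[float]:
    """Linearly interpolate a frame to a fixed number of samples.

    This normalization step is an assumption made for illustration only.
    """
    if length < 2 or len(frame) < 2:
        raise ValueError("need at least two samples in and out")
    step = (len(frame) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = min(int(pos), len(frame) - 2)  # left neighbour, clamped at the end
        frac = pos - lo
        out.append(frame[lo] * (1.0 - frac) + frame[lo + 1] * frac)
    return out


if __name__ == "__main__":
    # Toy signal with closures at samples 0, 4 and 9 (made-up values).
    toy_signal = [float(n) for n in range(10)]
    for period, frame in pitch_synchronous_frames(toy_signal, [0, 4, 9]):
        print(period, resample_linear(frame, 3))
```

Each returned pair keeps the pitch period next to its frame, so a later stage (for example a clustering step that uses the pitch period) can read it without recomputing it.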
Similar resources
Text-to-Speech Synthesis using Phoneme Concatenation
We proposed a Text-To-Speech (TTS) synthesis system based on phonetic concatenation for unrestricted input text. The input text is first converted into phonetic transcription using Letter-to-Sound rules. For synthesis of new speech, the TTS system selects the recorded phoneme units (PUs) from a database and modifies the duration according to the rule based on spelling using Time Domain Pitch Synchron...
An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...
Classification of Iranian Traditional Music Dastgahs Using Features Based on Pitch Frequency
Iranian traditional music is composed of seven major Dastgahs: Chahargah, Homayoun, Mahour, Segah, Shour, Nava, and Rast-Panjgah. In this paper, a new algorithm for the classification of the Iranian traditional music Dastgahs based on pitch frequency is proposed. In this algorithm, the features of Lagrange coefficients of pitch logarithm (LCPL), Fuzzy similarity sets type 2 (FSST2), and th...
Robust Controller Design for IG Driven by Variable-Speed in WECS Using μ-Synthesis
This paper presents a robust controller design for a wind-driven induction generator system using the structured singular value (μ-synthesis) method. The controller was designed for a static synchronous compensator (STATCOM) and a variable blade pitch angle in a wind energy conversion system (WECS) in order to achieve the required voltage and mechanical power control. The results indicated that this ...
Experiments on Chinese Speech Recog Pitch Estimation Using the M
Automatic speech recognition of a tonal and syllabic language such as Chinese Mandarin poses new challenges but also offers new opportunities. We present approaches and experimental results concerning the choice of base units for acoustic modeling, pitch estimation and how to integrate pitch estimates into the modeling framework. The experimental evaluations are carried out both on rather clean...